Skip to content
This repository was archived by the owner on Aug 15, 2019. It is now read-only.

Conversation

@bkazi
Copy link

@bkazi bkazi commented Aug 12, 2018

Description

tensorflow/tfjs#102

Offscreen Canvas is currently only shipping in Chrome so this could break for users who try to use WebGL backend in a worker in unsupported browsers.
The same issue applies to tests hence the support check in the test

Add feature to check if in worker
Add types for offscreen canvas
Add test for canvas when running in worker


For repository owners only:

Please remember to apply all applicable tags to your pull request.
Tags: FEATURE, BREAKING, BUG, PERF, DEV, DOC, SECURITY

For more info see: https://github.com/tensorflow/tfjs/blob/master/DEVELOPMENT.md


This change is Reviewable

Add feature to check if in worker
Add types for offscreen canvas
Add test for canvas when running in worker
@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

@bkazi
Copy link
Author

bkazi commented Aug 12, 2018

I'm trying to port the mnist-core example to run in a worker, the plumbing between the worker is the tricky bit, but the main functionality seems to work fine
Can't confirm until it completely works
fingers crossed

@googlebot
Copy link

CLAs look good, thanks!

@bkazi
Copy link
Author

bkazi commented Aug 14, 2018

I have a few issues that I would like some discussion around

  • There is a lot of duplication of the IS_WORKER feature check in the previous commit since it is needed for multiple checks that rely on the canvas to figure out which kind of canvas to use. My idea to improve this was to abstract away the creation of canvases, and default to using OffscreenCanvas instead of the full canvas element, falling back to the canvas element on unsupported browsers. Does that seem too future facing for now?
  • OffscreenCanvas doesn't have TypeScript types since it hasn't shipped in many browsers so this means creating the types for the time being. As a quick fix I have created it such that the type is the same as that of HTMLCanvasElement since the API is the same. Do we continue with that or define a separate type for OffscreenCanvas and extend the class property to return type HTMLCanvasElement | OffscreenCanvas? I think the second one is the better option in the long run but just checking
  • the NUM_BYTES_BEFORE_PAGING is currently based on a screen size heuristic which isn't available in workers, is there a sensible default for this?

Fixes failing tests on node 10
TypeStrong/ts-node#615
@bkazi bkazi changed the title [WIP] Add support for using webgl in a worker via offscreen canvas Add support for using webgl in a worker via offscreen canvas Aug 22, 2018
@tafsiri tafsiri self-requested a review October 5, 2018 14:32
@tafsiri tafsiri self-assigned this Oct 5, 2018
@tafsiri
Copy link
Contributor

tafsiri commented Oct 5, 2018

@bkazi Sorry for the long delay in reviewing this, its the kind of change that requires a fair amount of background research for us to consider maintainability over time and we just didn't have the bandwidth recently. I will be taking a closer look at this though. One question I had is whether you have a proof-of-concept application using this functionality. You mentioned the mnist-core example. Is that available on github somewhere?

@bkazi
Copy link
Author

bkazi commented Oct 6, 2018

@tafsiri cheers, it is a fairly large change so any feedback is appreciated
I have a PR for minst-core using web workers at tensorflow/tfjs-examples#129 (or master of bkazi/tfjs-examples)

@tafsiri
Copy link
Contributor

tafsiri commented Oct 8, 2018

@bkazi Thanks, so i took a look at this and I wonder if a potentially simpler approach would be to feature test for OffscreenCanvas (rather than being in a webworker) and always use it when available. That way we would not need the IS_WORKER code path.

Something along the lines of

if (ENV.get('IS_BROWSER')) {
  if (window && ('OffscreenCanvas' in window)) {
    // console.log('USING OFFSCREEN CANVAS');
    this.canvas = new OffscreenCanvas(1, 1);
  } else {
    this.canvas = document.createElement('canvas');
  }
}

I tried this and the tests passed, there is another canvas element created that would also need to be replaced by an optional OffScreenCanvas as well.

But I think the advantage of this is that we have fewer code paths and can more easily test this code path without having it execute in an actual web worker in our tests.

There would still remain things to figure out like NUM_BYTES_BEFORE_PAGING, but we can start by setting that to infinity if there is no window object as we develop some helpful heuristics for workers.

How does that sound to you? I'll take a look at your tfjs-example today.

@tafsiri
Copy link
Contributor

tafsiri commented Oct 8, 2018

As an addendum to the comment above I also did some perf testing and there seemd to be no perf hit from using OffscreenCanvas compared to regular canvas.

@bkazi
Copy link
Author

bkazi commented Oct 18, 2018

hey sorry it's taken a while, I've been busy. This sounds great! I was leaning towards something like you suggested but avoided it because I wasn't sure if it would cause any regressions.
I'll make the change and see how it goes, and update this PR when I can

@tafsiri
Copy link
Contributor

tafsiri commented Dec 14, 2018

@bkazi Just thought I'd check in with you on how this is going. This is something we are interested in supporting internally so if you don't have bandwidth for this let us know and we can probably take a pass at it. Thanks!

@jeffreytgilbert
Copy link

@dsmilkov Offscreen Canvas + ImageBitmap + Workers = zero copy non-blocking gpu accelerated ML

@jeffreytgilbert
Copy link

@tafsiri looking at these edits, they're on a much older release of TensorFlow. On the most recent release of TensorFlow, things have moved around quite a bit, so the edits don't apply 1:1. Additionally, OffscreenCanvas inherits from EventTarget, not HTMLElement, so even though they are similar in respect to canvas calls, they aren't with respect to everything not event listener related. It's probably a better idea to support OffscreenCanvas directly by adding it to the TypeScript definitions and exposing it as a return type.

I'm new to this codebase as of today, so I don't know how much of TensorFlow is actually using canvas elements in place but never returning them. Hopefully, anything TF related is only using it as a proxy for performing ML work on the GPU. If that's the case and it's not returning the canvas element itself as part of the returned results, this ends up being way easier to do.

@jeffreytgilbert
Copy link

So, both the Canvas and OffscreenCanvas share EventTarget. If you return OffscreenCanvas as a default, and then anywhere that could or does need to return an actual canvas element, have that convert the OffscreenCanvas to Canvas on the fly, I think that works for everyone. If that's not a supported use case (returning canvas in tf results) then just always consider the canvas OffscreenCanvas when supported and never think of it as a HTMLCanvasElement.

@bkazi
Copy link
Author

bkazi commented Apr 16, 2019

@jeffreytgilbert the types for OffScreenCanvas I included were kinda duck typing because at the time TS didn't have OffScreenCanvas types, I assume because there wasn't much browser support. Maybe TS has added the types now?
As far as I understand tfjs only uses the canvas to access a webgl context to interact with the GPU and the library users don't have direct access to the canvas, and most of the library too interacts with the webgl context not the canvas. So the idea was to default to OffScreenCanvas where supported
I have been neglecting this PR, have a month left until I hand in my dissertation so I'll be glad to pick up the slack on this after that, if someone doesn't beat me to it

@jeffreytgilbert
Copy link

You also don't get User Interactive support in OffscreenCanvas. The detection logic around OffscreenCanvas couples the code to the global "window" object which precludes them from running in a Worker thread. One of the main goals of this should be allowing this code to run in a background process so it doesn't block the UI. Any coupling to scope not compatible with a worker environment is suspect.

https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Functions_and_classes_available_to_workers

@jeffreytgilbert
Copy link

TS hasn't added the types, but I'm going through and adding them. I'm not starting from scratch, but I'm starting from 1.0.4 trying to get this working for a proof of concept which allows my app to do head coupled perspective tracking similar to what was available in the headtrackr from years ago. TF added a lot of cool bells and whistles, but it's sooo slow it crushes the RAF performance.

My project is on my page called "goto-booth" if you're interested to see what I'm working on. The headtrackr project has a demo here: https://www.auduno.com/headtrackr/examples/targets.html

It works if you catch the first exception in debugger and then run these commands:
video.srcObject = stream; video.play();

What was cool about this project is that it's using a simpler ML engine, but it's already running on web workers and that code was written 9 years ago! Anyway, I'm making comments here as I go through the updates one by one. If I get something working, I'll consider pushing updates back to the project.

@tafsiri
Copy link
Contributor

tafsiri commented Apr 16, 2019

@jeffreytgilbert Yep a lot has moved around in the code base recently.

@bkazi is correct, the webgl backend of tfjs uses canvas primarily as a way to get a WebGL context, and it is not directly exposed to end users.

We will probably have more bandwidth to look at this ourselves sometime next month as we look to support more environments where document is not available. But anyone interested can jump in on it before hand (thanks @bkazi for the giving it a start!). Ideally this would be orthogonal (bot helpful for) code for getting tfjs working smoothly in workers.

@jeffreytgilbert
Copy link

Keeping this thread going, I don't think this code does what the original author thinks it does. backend_webgl.ts has a dispose method that runs ".remove()" on the canvas, but that method silently does nothing because the element is not attached to the dom/document and the method actually belongs to Element, not Canvas, and is intended to remove an element from its parent (in this case, there is none). This is important to note because OffscreenCanvas of course does not inherit from Element, so it will fail in this case.

I think what should probably happen is there should be a new Canvas created at this point.

@jeffreytgilbert
Copy link

There do not appear to be many worrisome references to "document" in TF. The only references I see are related to elements that aren't supported in workers anyway (images, videos, canvas elements). That's likely due to this already being designed to work in Node.js.

I do see references to "window" though. Those would need to get updated to check Worker​Global​Scope as a fallback for cases like in Engine.ts. There are some methods related to files that use window.URL.createObjectURL that wont be compatible with workers. I made the following edit in Engine.ts, and I don't believe there is anything else that should keep this from working within a web worker now.

let GLOBAL: { _tfengine: Engine };
function getGlobalNamespace(): { _tfengine: Engine } {
if (GLOBAL == null) {
// tslint:disable-next-line:no-any
let ns: any; // WindowOrWorkerGlobalScope
if (typeof WorkerGlobalScope !== 'undefined') {
ns = self;
} else if (typeof (window) !== 'undefined') {
ns = window;
} else if (typeof (global) !== 'undefined') {
ns = global;
} else if (typeof (process) !== 'undefined') {
ns = process;
} else {
throw new Error('Could not find a global object');
}
GLOBAL = ns;
}
return GLOBAL;
}

I'm going to try it out tomorrow. My plan is to run TF outside of a web worker, then run it inside a web worker. In both cases, there should be better performance, but I also want it to be off the main thread so even if it's still a bit slow, it doesn't block the RAF. I have to see how the face-api.js interface I'm using wraps, calls, and sends data to TF. I probably have to edit that library too so once TF is in a worker, I can use ImageBitmap to transfer video data over to it.

@jeffreytgilbert
Copy link

Latest Typescript looks like it has offscreencanvas definitions. https://github.com/Microsoft/TypeScript/blob/master/src/lib/webworker.generated.d.ts

Lib might need an upgrade. also, the latest version of TF wont build by default because it has TSC errors. This project must have an interesting release process I don't understand.

@tafsiri
Copy link
Contributor

tafsiri commented Apr 17, 2019

@jeffreytgilbert Looks like you are making progress! Wrt to typescript versions we pin to the version of ts used internally at Google (currently 3.3.3333). If you have build errors with that versions feel free to file an issue.

@jeffreytgilbert
Copy link

Updating to typescript@next and a liberal sprinkling of // @ts-ignore for the unrelated build issues, and I finally got it to compile with typescript. I've opened up a typescript bug ticket for the issues I found around the OffscreenCanvas support. That ticket is here: microsoft/TypeScript#30998

@jeffreytgilbert
Copy link

I was not able to get TF to work as expected due to incompatibilities with the existing code using typescript 3.5.0+. Those issues were insurmountable given the time I have to work on my pet project, but the updates i've posted in this thread should be valuable for anyone working on official support within the project.

I pushed the code changes I made up to a fork, but I abandoned trying to merge those in via a PR when something in VS Code kept autoformatting the whitespace on edited files on save, in spite of having all IDE settings telling it not to. Maybe this is someone else's config or a project config for TensorFlow? Not sure what it was picking up that was making it do that, but it made the edits really noisy and I didn't have time to debug it. If you can find the option on github to "ignore whitespace" on the diffs, checkout the edits here to see my work: https://github.com/jeffreytgilbert/tfjs-core/commits/master

Mainly, the things of note are that anything where I added @ts-ignore to the code, that was something that was broken for the project and/or typescript@3.5.0+/next which will have to be fixed. Anything else I edited, I did as a 1-off edit, not a shared utility function. The project is enormous. I wasn't worried too much about the extra if/else logic added. Also, there were a few spots where type hints had to be applied because of a bug in typescript where it would not be able to correctly pick the right function signature and map those to the correct returned data-type, so I had to do that for the library in those cases.

I'm interested in this project, so if someone else starts working on it, please feel free to @ me on here.

@jeffreytgilbert
Copy link

@tafsiri

Just a quick update. I was able to fool tensorflow into running within a worker. Canvas works as expected. I did not need to update the library once the worker was updated to appear as though it were a browser. Obviously, video and image support are not available, however Canvas processing works well.

I've only tested this on Chrome. It's possible Firefox doesn't work at all, and no other browsers support OffscreenCanvas at this time.

justadudewhohacks/face-api.js#47

@jebeck
Copy link

jebeck commented Apr 24, 2019

@jeffreytgilbert I've been following this discussion because we've been using a badly outdated hacked-in OffscreenCanvas in tfjs-core for running tfjs-tsne in a Worker. Can confirm that it works in Firefox 44+ with gfx.offscreencanvas.enabled flag set to true via about:config.

@jeffreytgilbert
Copy link

@jebeck my case was specific to Chrome (since it was a kiosk) and I did not put any time into validating Firefox works. I did put in a ton of time updating tfjs to natively use OffscreenCanvas, but ultimately that required a full rebuild of multiple components with a version of typescript yet to be released. Because of my limited time, instead I found it easier to polyfill a worker with the values tfjs expected and make sure I wasn't trying to send it unsupported media types, like video elements and image elements, instead passing it only canvas elements, image data, or image bitmaps. Once I had done this, it worked for chrome and was very fast. Software hurdles aside, what ended up being the most problematic was the integrated graphics chip (Intel Iris Pro Graphics 5200) on the old macbook pro i was using. It did not have enough power or shaders to be able to split up the load efficiently, and it ended up blocking the ui thread with unyielding calls to the limited GPU. Turns out GPU lock is just as real a concern as UI main thread event loop lock for CPUs.

The project was here if you want to take a look at the worker patches. You shouldn't need anything more complicated to get something working with background threads than simply mocking out the environmental requirements. https://github.com/jeffreytgilbert/goto-booth/

@dsmilkov
Copy link
Contributor

dsmilkov commented Aug 7, 2019

Closing this issue since support for Web Workers was already added. Thank you for the contribution though!

@dsmilkov dsmilkov closed this Aug 7, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants